perf: reduce ConcurrentDictionary closure allocations in hot paths by thomhurst · Pull Request #5210 · thomhurst/TUnit

thomhurst · 2026-03-22T12:10:56Z

Summary

TestExecutionGuard: Add TryGetValue fast path before GetOrAdd to avoid allocating a TaskCompletionSource<bool> on every call when a test is already executing
EventReceiverOrchestrator: Add TryGetValue fast paths before GetOrAdd calls for first-test-in-session/assembly/class tasks, avoiding closure allocations from captured context, cancellationToken, and other state on the hot path. Also added fast paths for assembly/class test count counters to reduce lock contention (these use static lambdas so the savings are contention-related, not closure-related)
HookDelegateBuilder: Mark GetCachedGenericTypeDefinition lambda as static to prevent implicit closure allocation

Rationale

Profiling shows ~4.1% exclusive CPU in ConcurrentDictionary operations (TryAddInternal, AcquireAllLocks). These changes reduce allocations and contention by:

Checking TryGetValue before GetOrAdd so closures capturing state are never created on the fast path (when key already exists)
Avoiding unnecessary TaskCompletionSource allocations in the execution guard
Reducing lock contention on counter dictionaries that use static lambdas (already cached by the compiler, but GetOrAdd still acquires internal locks)
Using static lambdas where the lambda body doesn't need captured state

No behavioral changes — only allocation and contention reduction. Thread safety guarantees are preserved since TryGetValue is lock-free on ConcurrentDictionary and the subsequent GetOrAdd handles the race correctly.

Test plan

Verify all existing tests pass (no behavioral changes)
Confirm build succeeds across all target frameworks (net8.0, net9.0, net10.0, netstandard2.0)

Add TryGetValue fast paths before GetOrAdd calls to avoid closure allocations and unnecessary ConcurrentDictionary contention on the hot path. Key changes: - TestExecutionGuard: avoid allocating TaskCompletionSource when a test is already executing by checking TryGetValue first - EventReceiverOrchestrator: add TryGetValue fast paths before GetOrAdd for first-test-in-session/assembly/class event tasks and for assembly/class test count counters, avoiding closure allocations from captured state - HookDelegateBuilder: mark GetCachedGenericTypeDefinition lambda as static to prevent implicit closure allocation

claude

Code Review

Overall, this is a well-reasoned performance optimization PR. The changes are generally correct and the approach is sound. Here are my findings:

✅ — lambda

Clean, correct change. Adding is self-documenting intent and prevents any future accidental closure introduction if the lambda is modified. No concerns.

✅ — TryGetValue fast path

The fast path is correct. Worth noting the subtle thread-safety contract here: calls / before (in ), so any TCS retrieved via is guaranteed to be either:

Still in-progress (await will block until completion), or
Already completed (await returns immediately)

Either way, behaves correctly. The optimization is valid — it avoids allocating a on the common path when the test is already tracked.

✅ — First-test event task fast paths

This is the highest-impact change. The closures in do capture , //, and — so every call without the fast path allocates a new closure object on the heap. The bypass avoids this on the hot path (i.e., every test after the first in a session/assembly/class). Correct and well-targeted.

⚠️ Minor: PR description slightly overstates savings for counter dictionaries

In and , the pre-existing lambdas are . In C#, lambdas with no captures compile to cached static delegate fields — the delegate itself is never heap-allocated on repeated calls. So the bypass here does not save a closure/delegate allocation; it only saves the internal hash-table lookup and potential lock acquisition.

This is still a valid win under contention, but the PR description frames these as "closure allocation" savings when they're really "lock contention" savings. Not a correctness issue — just worth being precise for future readers.

🔍 — cold path concern (pre-existing, not introduced here)

The counter dicts (, ) are not cleared during , unlike the first-test task dicts. If is ever called more than once (e.g., for re-runs), the accumulates on top of existing values. This pre-dates this PR, but the refactored code now makes this more visible. Worth confirming the lifecycle contract — is always called exactly once, or should these counters also be reset?

Summary

The core optimization is sound and the implementation is correct. The -before- pattern is a well-established idiom for hot paths. The most impactful part is the first-test event closure avoidance, and the TCS avoidance is clean. The only non-trivial note is the potential double-count concern, which pre-exists this PR but is worth confirming.

claude

Code Review

Overall, this is a well-reasoned performance optimization PR. The changes are generally correct and the approach is sound. Here are my findings:

HookDelegateBuilder.cs — static lambda

Clean, correct change. Adding static is self-documenting intent and prevents any future accidental closure introduction if the lambda is modified. No concerns.

TestExecutionGuard.cs — TryGetValue fast path

The fast path is correct. Worth noting the subtle thread-safety contract here: ExecuteAndCompleteAsync calls SetResult/SetException before TryRemove (in finally), so any TCS retrieved via TryGetValue is guaranteed to be either in-progress (await blocks) or already completed (await returns immediately). Either way, WaitForExistingExecutionAsync behaves correctly.

The optimization is valid — it avoids allocating a TaskCompletionSource<bool> on the common path when the test is already tracked.

EventReceiverOrchestrator.cs — First-test event task fast paths

This is the highest-impact change. The GetOrAdd closures in InvokeFirstTestIn{Session,Assembly,Class}EventReceiversAsync do capture context, assemblyContext/classContext/sessionContext, and cancellationToken — so every call without the fast path allocates a new closure object on the heap. The TryGetValue bypass avoids this on the hot path (every test after the first in a session/assembly/class). Correct and well-targeted.

Minor: PR description slightly overstates savings for counter dictionaries

In InvokeLastTestIn{Assembly,Class}EventReceiversAsync and InitializeTestCounts, the pre-existing lambdas are static _ => new Counter(). In C#, static lambdas with no captures compile to cached static delegate fields — the delegate itself is never heap-allocated on repeated calls. So the TryGetValue bypass here does not save a closure/delegate allocation; it only saves the GetOrAdd internal hash-table lookup and potential lock acquisition under contention.

This is still a valid win, but the PR description frames these as "closure allocation" savings when they're really "lock contention" savings. Not a correctness issue — just worth being precise for future readers.

InitializeTestCounts — cold path concern (pre-existing, not introduced here)

The counter dicts (_assemblyTestCounts, _classTestCounts) are not cleared during InitializeTestCounts, unlike the first-test task dicts. If InitializeTestCounts is ever called more than once (e.g., re-runs), counter.Add(group.Count()) accumulates on top of existing values. This pre-dates this PR, but the refactored pattern makes it more visible. Worth confirming the lifecycle contract — is InitializeTestCounts always called exactly once, or should these counters also be reset?

Summary

The core optimization is sound and the implementation is correct. The TryGetValue-before-GetOrAdd pattern is a well-established idiom for ConcurrentDictionary hot paths. The most impactful change is the first-test event closure avoidance; the TestExecutionGuard TCS avoidance and the static lambda are clean wins too. The only non-trivial note is a pre-existing potential double-count concern in InitializeTestCounts worth confirming.

hartmair · 2026-03-22T19:03:18Z

TUnit.Engine/Services/EventReceiverOrchestrator.cs

+            return new ValueTask(existingTask);
+        }
+
        var task = _firstTestInSessionTasks.GetOrAdd("session",


@thomhurst FYI there is also an overload for this, which internally does almost exactly the same, see ConcurrentDictionary.cs:1223.

var task = _firstTestInSessionTasks.GetOrAdd("session", static (_, args) => InvokeFirstTestInSessionEventReceiversCoreAsync(args.Item1, args.Item2, args.Item2), (context, sessionContext, cancellationToken)); return new ValueTask(task);

Thanks @hartmair !

thomhurst temporarily deployed to Pull Requests March 22, 2026 12:11 — with GitHub Actions Inactive

thomhurst had a problem deploying to Pull Requests March 22, 2026 12:11 — with GitHub Actions Failure

claude bot reviewed Mar 22, 2026

View reviewed changes

thomhurst enabled auto-merge (squash) March 22, 2026 12:29

thomhurst merged commit 4961754 into main Mar 22, 2026
15 of 16 checks passed

thomhurst deleted the perf/reduce-concurrent-dict-allocations branch March 22, 2026 12:34

hartmair reviewed Mar 22, 2026

View reviewed changes

thomhurst mentioned this pull request Mar 22, 2026

perf: use GetOrAdd args overload to eliminate closure allocations in event receivers #5222

Merged

2 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

perf: reduce ConcurrentDictionary closure allocations in hot paths#5210

perf: reduce ConcurrentDictionary closure allocations in hot paths#5210
thomhurst merged 1 commit intomainfrom
perf/reduce-concurrent-dict-allocations

thomhurst commented Mar 22, 2026 •

edited

Loading

Uh oh!

claude bot left a comment

Uh oh!

claude bot left a comment

Uh oh!

Uh oh!

hartmair Mar 22, 2026

Uh oh!

thomhurst Mar 22, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

thomhurst commented Mar 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Rationale

Test plan

Uh oh!

claude bot left a comment

Choose a reason for hiding this comment

Code Review

✅ — lambda

✅ — TryGetValue fast path

✅ — First-test event task fast paths

⚠️ Minor: PR description slightly overstates savings for counter dictionaries

🔍 — cold path concern (pre-existing, not introduced here)

Summary

Uh oh!

claude bot left a comment

Choose a reason for hiding this comment

Code Review

HookDelegateBuilder.cs — static lambda

TestExecutionGuard.cs — TryGetValue fast path

EventReceiverOrchestrator.cs — First-test event task fast paths

Minor: PR description slightly overstates savings for counter dictionaries

InitializeTestCounts — cold path concern (pre-existing, not introduced here)

Summary

Uh oh!

Uh oh!

hartmair Mar 22, 2026

Choose a reason for hiding this comment

Uh oh!

thomhurst Mar 22, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

thomhurst commented Mar 22, 2026 •

edited

Loading